Overview
Dataset statistics
| Number of variables | 19 |
|---|---|
| Number of observations | 19768 |
| Missing cells | 10929 |
| Missing cells (%) | 2.9% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 2.1 MiB |
| Average record size in memory | 110.0 B |
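The missing-cell percentage follows from the table above: 19 variables times 19768 observations gives the total number of cells, of which 10929 are missing. A minimal sketch of that arithmetic (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 19
n_observations = 19768
missing_cells = 10929

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 375592
print(round(missing_pct, 1))  # 2.9
```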
Variable types
| Categorical | 1 |
|---|---|
| Boolean | 6 |
| Text | 1 |
| Numeric | 9 |
| DateTime | 2 |
Alerts
| Alert | Type |
|---|---|
| label is highly imbalanced (67.2%) | Imbalance |
| type is highly imbalanced (92.8%) | Imbalance |
| site_admin is highly imbalanced (95.8%) | Imbalance |
| log_public_repos has 942 (4.8%) infinite values | Infinite |
| log_public_gists has 7961 (40.3%) infinite values | Infinite |
| log_followers has 1445 (7.3%) infinite values | Infinite |
| log_following has 6017 (30.4%) infinite values | Infinite |
| bio has 10929 (55.3%) missing values | Missing |
| public_repos is highly skewed (γ1 = 53.8847472) | Skewed |
| public_gists is highly skewed (γ1 = 74.09063706) | Skewed |
| followers is highly skewed (γ1 = 32.46602776) | Skewed |
| following is highly skewed (γ1 = 39.87415424) | Skewed |
| public_repos has 942 (4.8%) zeros | Zeros |
| public_gists has 7961 (40.3%) zeros | Zeros |
| followers has 1445 (7.3%) zeros | Zeros |
| following has 6017 (30.4%) zeros | Zeros |
| text_bot_count has 19003 (96.1%) zeros | Zeros |
| log_public_repos has 551 (2.8%) zeros | Zeros |
| log_public_gists has 1873 (9.5%) zeros | Zeros |
| log_followers has 803 (4.1%) zeros | Zeros |
| log_following has 1734 (8.8%) zeros | Zeros |
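The Infinite counts on the log_* columns (942, 7961, 1445, 6017) equal the Zeros counts on the corresponding raw columns, which is what a plain natural logarithm of a zero count would produce. The sketch below assumes the columns were derived NumPy-style, where the log of 0 evaluates to -inf; the report does not state how they were computed. `numpy_style_log` is a hypothetical helper standing in for that behaviour, and `math.log1p` is shown as one common zero-safe alternative.

```python
import math

def numpy_style_log(x: float) -> float:
    """Natural log that returns -inf at 0 (assumed derivation, mimicking NumPy)."""
    return math.log(x) if x > 0 else float("-inf")

counts = [0, 1, 2, 26]  # illustrative raw counts, e.g. public_repos

logged = [numpy_style_log(c) for c in counts]
print(logged[0])            # -inf : a zero count becomes an "infinite" value
print(logged[1])            # 0.0  : a count of 1 becomes a "zero"
print(round(logged[3], 6))  # 3.258097, as in the first sample row

# A single -inf makes the sum and mean -inf, and the deviation
# (-inf) - (-inf) is nan, which propagates into variance and skewness.
mean = sum(logged) / len(logged)
print(mean, math.isnan(logged[0] - mean))  # -inf True

# log1p(x) = log(1 + x) stays finite at zero.
print([round(math.log1p(c), 4) for c in counts])
```

This also accounts for the Zeros alerts on the log_* columns: they count the rows where the raw value is exactly 1.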
Reproduction
| Analysis started | 2024-11-19 15:26:17.036550 |
|---|---|
| Analysis finished | 2024-11-19 15:26:27.304061 |
| Duration | 10.27 seconds |
| Software version | ydata-profiling v4.12.0 |
| Download configuration | config.json |
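The Duration row is the difference between the two timestamps above; a quick check with Python's standard library:

```python
from datetime import datetime

started = datetime.fromisoformat("2024-11-19 15:26:17.036550")
finished = datetime.fromisoformat("2024-11-19 15:26:27.304061")

duration = (finished - started).total_seconds()
print(duration)            # 10.267511
print(round(duration, 2))  # 10.27
```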
Variables
label
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 154.6 KiB |
| Human | 18578 |
|---|---|
| Bot | 1190 |
Length
| Max length | 5 |
|---|---|
| Median length | 5 |
| Mean length | 4.8796034 |
| Min length | 3 |
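The mean length can be reproduced from the class counts, since every value is either "Human" (5 characters) or "Bot" (3 characters). The same product gives the total of 96460 characters that appears in the character tables further down. A small sketch with the counts from the label frequency tables:

```python
# Class counts taken from the label frequency tables.
counts = {"Human": 18578, "Bot": 1190}

total_chars = sum(len(value) * n for value, n in counts.items())
total_rows = sum(counts.values())

print(total_chars)                         # 96460
print(round(total_chars / total_rows, 7))  # 4.8796034
```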
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Human |
|---|---|
| 2nd row | Human |
| 3rd row | Human |
| 4th row | Bot |
| 5th row | Human |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| Human | 18578 | 94.0% |
| Bot | 1190 | 6.0% |
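The 67.2% imbalance reported for label is consistent with a normalized-entropy score, 1 − H(p) / log2(k), computed from the class counts above. The sketch below assumes that definition (the report itself does not spell out the formula); `imbalance_score` is an illustrative name. The same calculation on the type and site_admin counts matches their alerts as well.

```python
import math

def imbalance_score(counts: list[int]) -> float:
    """1 - (Shannon entropy / maximum entropy); 0 = balanced, 1 = a single class."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

print(round(100 * imbalance_score([18578, 1190]), 1))  # label: 67.2
print(round(100 * imbalance_score([19597, 171]), 1))   # type: 92.8
print(round(100 * imbalance_score([19678, 90]), 1))    # site_admin: 95.8
```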
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| H | 18578 | 19.3% |
| u | 18578 | 19.3% |
| m | 18578 | 19.3% |
| a | 18578 | 19.3% |
| n | 18578 | 19.3% |
| B | 1190 | 1.2% |
| o | 1190 | 1.2% |
| t | 1190 | 1.2% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 96460 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 96460 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 96460 | 100.0% |
type
Boolean
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.4 KiB |
| True | 19597 |
|---|---|
| False | 171 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| True | 19597 | 99.1% |
| False | 171 | 0.9% |
site_admin
Boolean
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 19.4 KiB |
| False | 19678 |
|---|---|
| True | 90 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| False | 19678 | 99.5% |
| True | 90 | 0.5% |
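company
Boolean
| Value | Count | Frequency (%) |
|---|---|---|
| True | 10794 | 54.6% |
| False | 8974 | 45.4% |
blog
Boolean
| Value | Count | Frequency (%) |
|---|---|---|
| False | 11256 | 56.9% |
| True | 8512 | 43.1% |
location
Boolean
| Value | Count | Frequency (%) |
|---|---|---|
| True | 12691 | 64.2% |
| False | 7077 | 35.8% |
hireable
Boolean
| Value | Count | Frequency (%) |
|---|---|---|
| False | 16470 | 83.3% |
| True | 3298 | 16.7% |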
bio
Text
Missing 
| Distinct | 8641 |
|---|---|
| Distinct (%) | 97.8% |
| Missing | 10929 |
| Missing (%) | 55.3% |
| Memory size | 154.6 KiB |
Length
| Max length | 160 |
|---|---|
| Median length | 116 |
| Mean length | 61.460459 |
| Min length | 1 |
Unique
| Unique | 8574 |
|---|---|
| Unique (%) | 97.0% |
Sample
| 1st row | I just press the buttons randomly, and the program evolves... |
|---|---|
| 2nd row | Time is unimportant, only life important. |
| 3rd row | Done studying. Need challenges. |
| 4th row | Administrator of MOONGIFT that is introducing open source software everyday to Japanese engineers since 2004. |
| 5th row | Senior Software Engineer at Google, working on Certificate Transparency and generalized transparency. |
Words
| Value | Count | Frequency (%) |
|---|---|---|
|  | 3069 | 3.9% |
| and | 2526 | 3.2% |
| engineer | 1583 | 2.0% |
| software | 1521 | 1.9% |
| of | 1488 | 1.9% |
| at | 1380 | 1.8% |
| developer | 1236 | 1.6% |
| the | 1086 | 1.4% |
| a | 1038 | 1.3% |
| i | 1033 | 1.3% |
| Other values (14754) | 62407 | 79.6% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 70014 | 12.9% |
| e | 49589 | 9.1% |
| o | 32360 | 6.0% |
| n | 31402 | 5.8% |
| a | 31366 | 5.8% |
| t | 31195 | 5.7% |
| r | 31181 | 5.7% |
| i | 28526 | 5.3% |
| s | 19655 | 3.6% |
| l | 14767 | 2.7% |
| Other values (1736) | 203194 | 37.4% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 543249 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 543249 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 543249 | 100.0% |
public_repos
Real number (ℝ)
Skewed  Zeros 
| Distinct | 674 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 84.139215 |
| Minimum | 0 |
| Maximum | 50000 |
| Zeros | 942 |
| Zeros (%) | 4.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 154.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 11 |
| median | 35 |
| Q3 | 83 |
| 95-th percentile | 250 |
| Maximum | 50000 |
| Range | 50000 |
| Interquartile range (IQR) | 72 |
Descriptive statistics
| Standard deviation | 574.75022 |
|---|---|
| Coefficient of variation (CV) | 6.8309434 |
| Kurtosis | 3700.1203 |
| Mean | 84.139215 |
| Median Absolute Deviation (MAD) | 29 |
| Skewness | 53.884747 |
| Sum | 1663264 |
| Variance | 330337.81 |
| Monotonicity | Not monotonic |
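Several rows in the two tables above are derived from others and can be cross-checked: the range and IQR come from the quantiles, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short sketch using the public_repos figures as printed, so results agree only up to the report's rounding:

```python
import math

# Values copied from the public_repos tables.
minimum, q1, q3, maximum = 0, 11, 83, 50000
mean, std = 84.139215, 574.75022

value_range = maximum - minimum  # reported range: 50000
iqr = q3 - q1                    # reported IQR: 72
cv = std / mean                  # reported CV: 6.8309434
variance = std ** 2              # reported variance: 330337.81

print(value_range, iqr)
print(round(cv, 4), round(variance, 1))

# The derived values match the printed ones within rounding error.
assert math.isclose(cv, 6.8309434, rel_tol=1e-6)
assert math.isclose(variance, 330337.81, rel_tol=1e-6)
```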
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 942 | 4.8% |
| 1 | 551 | 2.8% |
| 2 | 465 | 2.4% |
| 3 | 396 | 2.0% |
| 4 | 380 | 1.9% |
| 6 | 364 | 1.8% |
| 5 | 357 | 1.8% |
| 7 | 330 | 1.7% |
| 9 | 312 | 1.6% |
| 8 | 307 | 1.6% |
| Other values (664) | 15364 | 77.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 942 | 4.8% |
| 1 | 551 | 2.8% |
| 2 | 465 | 2.4% |
| 3 | 396 | 2.0% |
| 4 | 380 | 1.9% |
| 5 | 357 | 1.8% |
| 6 | 364 | 1.8% |
| 7 | 330 | 1.7% |
| 8 | 307 | 1.6% |
| 9 | 312 | 1.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 50000 | 1 | < 0.1% |
| 27746 | 1 | < 0.1% |
| 26360 | 1 | < 0.1% |
| 22618 | 1 | < 0.1% |
| 20693 | 1 | < 0.1% |
| 17425 | 1 | < 0.1% |
| 16985 | 1 | < 0.1% |
| 16839 | 1 | < 0.1% |
| 9666 | 1 | < 0.1% |
| 9554 | 1 | < 0.1% |
public_gists
Real number (ℝ)
Skewed  Zeros 
| Distinct | 359 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 25.214083 |
| Minimum | 0 |
| Maximum | 55781 |
| Zeros | 7961 |
| Zeros (%) | 40.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 154.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 2 |
| Q3 | 10 |
| 95-th percentile | 66 |
| Maximum | 55781 |
| Range | 55781 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 635.69014 |
|---|---|
| Coefficient of variation (CV) | 25.211709 |
| Kurtosis | 5955.7935 |
| Mean | 25.214083 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 74.090637 |
| Sum | 498432 |
| Variance | 404101.96 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7961 | 40.3% |
| 1 | 1873 | 9.5% |
| 2 | 1152 | 5.8% |
| 3 | 823 | 4.2% |
| 4 | 665 | 3.4% |
| 5 | 627 | 3.2% |
| 6 | 488 | 2.5% |
| 7 | 405 | 2.0% |
| 9 | 327 | 1.7% |
| 8 | 318 | 1.6% |
| Other values (349) | 5129 | 25.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7961 | 40.3% |
| 1 | 1873 | 9.5% |
| 2 | 1152 | 5.8% |
| 3 | 823 | 4.2% |
| 4 | 665 | 3.4% |
| 5 | 627 | 3.2% |
| 6 | 488 | 2.5% |
| 7 | 405 | 2.0% |
| 8 | 318 | 1.6% |
| 9 | 327 | 1.7% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 55781 | 1 | < 0.1% |
| 53660 | 1 | < 0.1% |
| 28943 | 1 | < 0.1% |
| 26879 | 1 | < 0.1% |
| 15482 | 1 | < 0.1% |
| 10604 | 1 | < 0.1% |
| 3450 | 1 | < 0.1% |
| 3170 | 1 | < 0.1% |
| 2565 | 1 | < 0.1% |
| 1750 | 1 | < 0.1% |
followers
Real number (ℝ)
Skewed  Zeros 
| Distinct | 1598 |
|---|---|
| Distinct (%) | 8.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 245.49702 |
| Minimum | 0 |
| Maximum | 95752 |
| Zeros | 1445 |
| Zeros (%) | 7.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 154.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 7 |
| median | 33 |
| Q3 | 125 |
| 95-th percentile | 836 |
| Maximum | 95752 |
| Range | 95752 |
| Interquartile range (IQR) | 118 |
Descriptive statistics
| Standard deviation | 1535.94 |
|---|---|
| Coefficient of variation (CV) | 6.2564506 |
| Kurtosis | 1570.3008 |
| Mean | 245.49702 |
| Median Absolute Deviation (MAD) | 31 |
| Skewness | 32.466028 |
| Sum | 4852985 |
| Variance | 2359111.6 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1445 | 7.3% |
| 1 | 803 | 4.1% |
| 2 | 623 | 3.2% |
| 3 | 515 | 2.6% |
| 4 | 450 | 2.3% |
| 5 | 415 | 2.1% |
| 6 | 396 | 2.0% |
| 7 | 347 | 1.8% |
| 8 | 338 | 1.7% |
| 9 | 311 | 1.6% |
| Other values (1588) | 14125 | 71.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1445 | 7.3% |
| 1 | 803 | 4.1% |
| 2 | 623 | 3.2% |
| 3 | 515 | 2.6% |
| 4 | 450 | 2.3% |
| 5 | 415 | 2.1% |
| 6 | 396 | 2.0% |
| 7 | 347 | 1.8% |
| 8 | 338 | 1.7% |
| 9 | 311 | 1.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 95752 | 1 | < 0.1% |
| 84979 | 1 | < 0.1% |
| 66203 | 1 | < 0.1% |
| 58452 | 1 | < 0.1% |
| 31120 | 1 | < 0.1% |
| 30287 | 1 | < 0.1% |
| 29719 | 1 | < 0.1% |
| 29414 | 1 | < 0.1% |
| 28411 | 1 | < 0.1% |
| 25815 | 1 | < 0.1% |
following
Real number (ℝ)
Skewed  Zeros 
| Distinct | 620 |
|---|---|
| Distinct (%) | 3.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 44.520741 |
| Minimum | 0 |
| Maximum | 27775 |
| Zeros | 6017 |
| Zeros (%) | 30.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 154.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 4 |
| Q3 | 22 |
| 95-th percentile | 148 |
| Maximum | 27775 |
| Range | 27775 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 366.79344 |
|---|---|
| Coefficient of variation (CV) | 8.2387093 |
| Kurtosis | 2260.6155 |
| Mean | 44.520741 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | 39.874154 |
| Sum | 880086 |
| Variance | 134537.43 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 6017 | 30.4% |
| 1 | 1734 | 8.8% |
| 2 | 1092 | 5.5% |
| 3 | 794 | 4.0% |
| 4 | 602 | 3.0% |
| 5 | 533 | 2.7% |
| 6 | 484 | 2.4% |
| 7 | 407 | 2.1% |
| 8 | 368 | 1.9% |
| 9 | 322 | 1.6% |
| Other values (610) | 7415 | 37.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 6017 | 30.4% |
| 1 | 1734 | 8.8% |
| 2 | 1092 | 5.5% |
| 3 | 794 | 4.0% |
| 4 | 602 | 3.0% |
| 5 | 533 | 2.7% |
| 6 | 484 | 2.4% |
| 7 | 407 | 2.1% |
| 8 | 368 | 1.9% |
| 9 | 322 | 1.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 27775 | 1 | < 0.1% |
| 16741 | 1 | < 0.1% |
| 15931 | 1 | < 0.1% |
| 11921 | 1 | < 0.1% |
| 10268 | 1 | < 0.1% |
| 9720 | 1 | < 0.1% |
| 9686 | 1 | < 0.1% |
| 9532 | 1 | < 0.1% |
| 9367 | 1 | < 0.1% |
| 7374 | 1 | < 0.1% |
created_at
Date
| Distinct | 19767 |
|---|---|
| Distinct (%) | > 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 154.6 KiB |
| Minimum | 2008-01-27 07:09:47+00:00 |
| Maximum | 2021-12-20 05:29:41+00:00 |
updated_at
Date
| Distinct | 19633 |
|---|---|
| Distinct (%) | 99.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 154.6 KiB |
| Minimum | 2016-08-08 22:18:09+00:00 |
| Maximum | 2023-10-14 14:33:48+00:00 |
text_bot_count
Real number (ℝ)
Zeros 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.061361797 |
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 19003 |
| Zeros (%) | 96.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 154.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.34100309 |
|---|---|
| Coefficient of variation (CV) | 5.5572539 |
| Kurtosis | 51.672415 |
| Mean | 0.061361797 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 6.674794 |
| Sum | 1213 |
| Variance | 0.11628311 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 19003 | 96.1% |
| 1 | 425 | 2.1% |
| 2 | 251 | 1.3% |
| 3 | 75 | 0.4% |
| 4 | 9 | < 0.1% |
| 5 | 5 | < 0.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 19003 | 96.1% |
| 1 | 425 | 2.1% |
| 2 | 251 | 1.3% |
| 3 | 75 | 0.4% |
| 4 | 9 | < 0.1% |
| 5 | 5 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 5 | < 0.1% |
| 4 | 9 | < 0.1% |
| 3 | 75 | 0.4% |
| 2 | 251 | 1.3% |
| 1 | 425 | 2.1% |
| 0 | 19003 | 96.1% |
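Because text_bot_count takes only six distinct values, the frequency tables above fully determine its summary statistics. The sketch below rebuilds the sum, mean and variance from the counts. It assumes the report uses the sample (n − 1) variance, which is the convention that reproduces the printed 0.11628311.

```python
import math

# value -> count, copied from the text_bot_count frequency table
freq = {0: 19003, 1: 425, 2: 251, 3: 75, 4: 9, 5: 5}

n = sum(freq.values())                       # 19768 observations
total = sum(v * c for v, c in freq.items())  # reported sum: 1213
mean = total / n
# Sample variance (divides by n - 1): an assumption about the report's convention.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)

print(n, total)                       # 19768 1213
print(round(mean, 9))                 # 0.061361797
print(round(variance, 6))             # 0.116283
print(round(math.sqrt(variance), 5))  # standard deviation
```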
log_public_repos
Real number (ℝ)
Infinite  Zeros 
| Distinct | 674 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 942 |
| Infinite (%) | 4.8% |
| Mean | -inf |
| Minimum | -inf |
| Maximum | 10.819778 |
| Zeros | 551 |
| Zeros (%) | 2.8% |
| Negative | 942 |
| Negative (%) | 4.8% |
| Memory size | 154.6 KiB |
Quantile statistics
| Minimum | -inf |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2.3978953 |
| median | 3.5553481 |
| Q3 | 4.4188406 |
| 95-th percentile | 5.5214609 |
| Maximum | 10.819778 |
| Range | inf |
| Interquartile range (IQR) | 2.0209453 |
Descriptive statistics
| Standard deviation | nan |
|---|---|
| Coefficient of variation (CV) | nan |
| Kurtosis | nan |
| Mean | -inf |
| Median Absolute Deviation (MAD) | 0.96644052 |
| Skewness | nan |
| Sum | -inf |
| Variance | nan |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| -inf | 942 | 4.8% |
| 0 | 551 | 2.8% |
| 0.6931471806 | 465 | 2.4% |
| 1.098612289 | 396 | 2.0% |
| 1.386294361 | 380 | 1.9% |
| 1.791759469 | 364 | 1.8% |
| 1.609437912 | 357 | 1.8% |
| 1.945910149 | 330 | 1.7% |
| 2.197224577 | 312 | 1.6% |
| 2.079441542 | 307 | 1.6% |
| Other values (664) | 15364 | 77.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -inf | 942 | 4.8% |
| 0 | 551 | 2.8% |
| 0.6931471806 | 465 | 2.4% |
| 1.098612289 | 396 | 2.0% |
| 1.386294361 | 380 | 1.9% |
| 1.609437912 | 357 | 1.8% |
| 1.791759469 | 364 | 1.8% |
| 1.945910149 | 330 | 1.7% |
| 2.079441542 | 307 | 1.6% |
| 2.197224577 | 312 | 1.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.81977828 | 1 | < 0.1% |
| 10.23084696 | 1 | < 0.1% |
| 10.17960299 | 1 | < 0.1% |
| 10.02650133 | 1 | < 0.1% |
| 9.937550758 | 1 | < 0.1% |
| 9.765661236 | 1 | < 0.1% |
| 9.740085881 | 1 | < 0.1% |
| 9.731452904 | 1 | < 0.1% |
| 9.176369852 | 1 | < 0.1% |
| 9.164715194 | 1 | < 0.1% |
log_public_gists
Real number (ℝ)
Infinite  Zeros 
| Distinct | 359 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 7961 |
| Infinite (%) | 40.3% |
| Mean | -inf |
| Minimum | -inf |
| Maximum | 10.929189 |
| Zeros | 1873 |
| Zeros (%) | 9.5% |
| Negative | 7961 |
| Negative (%) | 40.3% |
| Memory size | 154.6 KiB |
Quantile statistics
| Minimum | -inf |
|---|---|
| 5-th percentile | nan |
| Q1 | nan |
| median | 0.69314718 |
| Q3 | 2.3025851 |
| 95-th percentile | 4.1896547 |
| Maximum | 10.929189 |
| Range | inf |
| Interquartile range (IQR) | nan |
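The nan entries for the 5th percentile, Q1 and IQR are what linear-interpolation quantiles produce when both neighbouring order statistics are -inf: the interpolation step computes (-inf) − (-inf), which is nan. The sketch below uses a hand-written `linear_quantile` on a small list shaped like log_public_gists. The helper is illustrative, and that the profiler interpolates in exactly this way is an assumption.

```python
import math

def linear_quantile(sorted_values: list[float], q: float) -> float:
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    low, high = sorted_values[lo], sorted_values[hi]
    return low + (high - low) * frac

ninf = float("-inf")
# 4 of 10 values are -inf, roughly the 40.3% seen in log_public_gists.
data = sorted([ninf, ninf, ninf, ninf, 0.0, 0.0, 0.693, 1.099, 2.303, 4.19])

print(linear_quantile(data, 0.05))  # nan: both neighbours are -inf
print(linear_quantile(data, 0.25))  # nan
print(linear_quantile(data, 0.75))  # finite
```

Quantiles that fall entirely among finite values, such as the median and Q3 here, are unaffected.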
Descriptive statistics
| Standard deviation | nan |
|---|---|
| Coefficient of variation (CV) | nan |
| Kurtosis | nan |
| Mean | -inf |
| Median Absolute Deviation (MAD) | 2.8622009 |
| Skewness | nan |
| Sum | -inf |
| Variance | nan |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| -inf | 7961 | 40.3% |
| 0 | 1873 | 9.5% |
| 0.6931471806 | 1152 | 5.8% |
| 1.098612289 | 823 | 4.2% |
| 1.386294361 | 665 | 3.4% |
| 1.609437912 | 627 | 3.2% |
| 1.791759469 | 488 | 2.5% |
| 1.945910149 | 405 | 2.0% |
| 2.197224577 | 327 | 1.7% |
| 2.079441542 | 318 | 1.6% |
| Other values (349) | 5129 | 25.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -inf | 7961 | 40.3% |
| 0 | 1873 | 9.5% |
| 0.6931471806 | 1152 | 5.8% |
| 1.098612289 | 823 | 4.2% |
| 1.386294361 | 665 | 3.4% |
| 1.609437912 | 627 | 3.2% |
| 1.791759469 | 488 | 2.5% |
| 1.945910149 | 405 | 2.0% |
| 2.079441542 | 318 | 1.6% |
| 2.197224577 | 327 | 1.7% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.92918859 | 1 | < 0.1% |
| 10.89042312 | 1 | < 0.1% |
| 10.27308366 | 1 | < 0.1% |
| 10.19910059 | 1 | < 0.1% |
| 9.647433338 | 1 | < 0.1% |
| 9.268986567 | 1 | < 0.1% |
| 8.14612951 | 1 | < 0.1% |
| 8.061486867 | 1 | < 0.1% |
| 7.849713758 | 1 | < 0.1% |
| 7.467371067 | 1 | < 0.1% |
log_followers
Real number (ℝ)
Infinite  Zeros 
| Distinct | 1598 |
|---|---|
| Distinct (%) | 8.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 1445 |
| Infinite (%) | 7.3% |
| Mean | -inf |
| Minimum | -inf |
| Maximum | 11.469517 |
| Zeros | 803 |
| Zeros (%) | 4.1% |
| Negative | 1445 |
| Negative (%) | 7.3% |
| Memory size | 154.6 KiB |
Quantile statistics
| Minimum | -inf |
|---|---|
| 5-th percentile | nan |
| Q1 | 1.9459101 |
| median | 3.4965076 |
| Q3 | 4.8283137 |
| 95-th percentile | 6.7286286 |
| Maximum | 11.469517 |
| Range | inf |
| Interquartile range (IQR) | 2.8824036 |
Descriptive statistics
| Standard deviation | nan |
|---|---|
| Coefficient of variation (CV) | nan |
| Kurtosis | nan |
| Mean | -inf |
| Median Absolute Deviation (MAD) | 1.417066 |
| Skewness | nan |
| Sum | -inf |
| Variance | nan |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| -inf | 1445 | 7.3% |
| 0 | 803 | 4.1% |
| 0.6931471806 | 623 | 3.2% |
| 1.098612289 | 515 | 2.6% |
| 1.386294361 | 450 | 2.3% |
| 1.609437912 | 415 | 2.1% |
| 1.791759469 | 396 | 2.0% |
| 1.945910149 | 347 | 1.8% |
| 2.079441542 | 338 | 1.7% |
| 2.197224577 | 311 | 1.6% |
| Other values (1588) | 14125 | 71.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -inf | 1445 | 7.3% |
| 0 | 803 | 4.1% |
| 0.6931471806 | 623 | 3.2% |
| 1.098612289 | 515 | 2.6% |
| 1.386294361 | 450 | 2.3% |
| 1.609437912 | 415 | 2.1% |
| 1.791759469 | 396 | 2.0% |
| 1.945910149 | 347 | 1.8% |
| 2.079441542 | 338 | 1.7% |
| 2.197224577 | 311 | 1.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 11.46951679 | 1 | < 0.1% |
| 11.35015945 | 1 | < 0.1% |
| 11.10048106 | 1 | < 0.1% |
| 10.97596118 | 1 | < 0.1% |
| 10.34560598 | 1 | < 0.1% |
| 10.31847386 | 1 | < 0.1% |
| 10.29954185 | 1 | < 0.1% |
| 10.28922603 | 1 | < 0.1% |
| 10.25453167 | 1 | < 0.1% |
| 10.158711 | 1 | < 0.1% |
log_following
Real number (ℝ)
Infinite  Zeros 
| Distinct | 620 |
|---|---|
| Distinct (%) | 3.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 6017 |
| Infinite (%) | 30.4% |
| Mean | -inf |
| Minimum | -inf |
| Maximum | 10.231892 |
| Zeros | 1734 |
| Zeros (%) | 8.8% |
| Negative | 6017 |
| Negative (%) | 30.4% |
| Memory size | 154.6 KiB |
Quantile statistics
| Minimum | -inf |
|---|---|
| 5-th percentile | nan |
| Q1 | nan |
| median | 1.3862944 |
| Q3 | 3.0910425 |
| 95-th percentile | 4.9972123 |
| Maximum | 10.231892 |
| Range | inf |
| Interquartile range (IQR) | nan |
Descriptive statistics
| Standard deviation | nan |
|---|---|
| Coefficient of variation (CV) | nan |
| Kurtosis | nan |
| Mean | -inf |
| Median Absolute Deviation (MAD) | 2.0794415 |
| Skewness | nan |
| Sum | -inf |
| Variance | nan |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| -inf | 6017 | 30.4% |
| 0 | 1734 | 8.8% |
| 0.6931471806 | 1092 | 5.5% |
| 1.098612289 | 794 | 4.0% |
| 1.386294361 | 602 | 3.0% |
| 1.609437912 | 533 | 2.7% |
| 1.791759469 | 484 | 2.4% |
| 1.945910149 | 407 | 2.1% |
| 2.079441542 | 368 | 1.9% |
| 2.197224577 | 322 | 1.6% |
| Other values (610) | 7415 | 37.5% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -inf | 6017 | 30.4% |
| 0 | 1734 | 8.8% |
| 0.6931471806 | 1092 | 5.5% |
| 1.098612289 | 794 | 4.0% |
| 1.386294361 | 602 | 3.0% |
| 1.609437912 | 533 | 2.7% |
| 1.791759469 | 484 | 2.4% |
| 1.945910149 | 407 | 2.1% |
| 2.079441542 | 368 | 1.9% |
| 2.197224577 | 322 | 1.6% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.23189161 | 1 | < 0.1% |
| 9.725616079 | 1 | < 0.1% |
| 9.676022176 | 1 | < 0.1% |
| 9.38605683 | 1 | < 0.1% |
| 9.236787542 | 1 | < 0.1% |
| 9.181940897 | 1 | < 0.1% |
| 9.178436823 | 1 | < 0.1% |
| 9.162409838 | 1 | < 0.1% |
| 9.144948153 | 1 | < 0.1% |
| 8.905715579 | 1 | < 0.1% |
Interactions
Missing values
Sample
First rows
| | label | type | site_admin | company | blog | location | hireable | bio | public_repos | public_gists | followers | following | created_at | updated_at | text_bot_count | log_public_repos | log_public_gists | log_followers | log_following |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Human | True | False | False | False | False | False | NaN | 26 | 1 | 5 | 1 | 2011-09-26 17:27:03+00:00 | 2023-10-13 11:21:10+00:00 | 0 | 3.258097 | 0.000000 | 1.609438 | 0.000000 |
| 1 | Human | True | False | False | True | False | True | I just press the buttons randomly, and the program evolves... | 30 | 3 | 9 | 6 | 2015-06-29 10:12:46+00:00 | 2023-10-07 06:26:14+00:00 | 0 | 3.401197 | 1.098612 | 2.197225 | 1.791759 |
| 2 | Human | True | False | True | True | True | True | Time is unimportant,\nonly life important. | 103 | 49 | 1212 | 221 | 2008-08-29 16:20:03+00:00 | 2023-10-02 02:11:21+00:00 | 0 | 4.634729 | 3.891820 | 7.100027 | 5.398163 |
| 3 | Bot | True | False | False | False | True | False | NaN | 49 | 0 | 84 | 2 | 2014-05-20 18:43:09+00:00 | 2023-10-12 12:54:59+00:00 | 0 | 3.891820 | -inf | 4.430817 | 0.693147 |
| 4 | Human | True | False | False | False | False | True | NaN | 11 | 1 | 6 | 2 | 2012-08-16 14:19:13+00:00 | 2023-10-06 11:58:41+00:00 | 0 | 2.397895 | 0.000000 | 1.791759 | 0.693147 |
| 5 | Human | True | False | True | True | True | False | Done studying. Need challenges. | 56 | 1 | 22 | 7 | 2017-04-11 14:08:07+00:00 | 2023-10-11 05:59:26+00:00 | 0 | 4.025352 | 0.000000 | 3.091042 | 1.945910 |
| 6 | Human | True | False | True | True | True | True | Administrator of MOONGIFT that is introducing open source software everyday to Japanese engineers since 2004. | 277 | 1139 | 63 | 16 | 2008-04-07 22:22:22+00:00 | 2023-09-27 09:04:56+00:00 | 0 | 5.624018 | 7.037906 | 4.143135 | 2.772589 |
| 7 | Human | True | False | True | False | True | False | Senior Software Engineer at Google, working on Certificate Transparency and generalized transparency. | 37 | 1 | 22 | 0 | 2012-01-19 21:57:07+00:00 | 2023-08-07 16:06:34+00:00 | 0 | 3.610918 | 0.000000 | 3.091042 | -inf |
| 8 | Human | True | False | False | False | False | False | NaN | 27 | 2 | 37 | 596 | 2019-12-24 20:04:33+00:00 | 2023-10-12 11:55:01+00:00 | 0 | 3.295837 | 0.693147 | 3.610918 | 6.390241 |
| 9 | Human | True | False | True | True | True | False | Hi | 42 | 9 | 14 | 2 | 2013-07-23 23:29:34+00:00 | 2023-10-09 20:47:05+00:00 | 0 | 3.737670 | 2.197225 | 2.639057 | 0.693147 |
Last rows
| | label | type | site_admin | company | blog | location | hireable | bio | public_repos | public_gists | followers | following | created_at | updated_at | text_bot_count | log_public_repos | log_public_gists | log_followers | log_following |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19758 | Human | True | False | True | False | True | False | NaN | 30 | 0 | 10 | 11 | 2016-09-10 09:45:00+00:00 | 2023-10-06 11:30:51+00:00 | 0 | 3.401197 | -inf | 2.302585 | 2.397895 |
| 19759 | Human | True | False | False | False | True | True | NaN | 37 | 19 | 91 | 6 | 2012-04-19 03:27:14+00:00 | 2023-10-07 18:13:52+00:00 | 0 | 3.610918 | 2.944439 | 4.510860 | 1.791759 |
| 19760 | Bot | True | False | False | False | False | False | I am the bot account of @alvaroaleman | 1 | 0 | 0 | 0 | 2018-12-15 19:55:31+00:00 | 2021-07-27 14:14:25+00:00 | 2 | 0.000000 | -inf | -inf | -inf |
| 19761 | Human | True | False | False | False | False | False | NaN | 3 | 0 | 1 | 0 | 2013-11-10 16:05:37+00:00 | 2023-08-31 14:26:08+00:00 | 2 | 1.098612 | -inf | 0.000000 | -inf |
| 19762 | Human | True | False | False | False | False | False | NaN | 0 | 0 | 0 | 0 | 2020-10-01 18:30:32+00:00 | 2020-12-29 19:45:12+00:00 | 0 | -inf | -inf | -inf | -inf |
| 19763 | Bot | True | False | True | True | True | False | Tony came to Linux in 1994 and has never looked back. His entire professional career has been spent working with or on Linux. First as a systems administrator | 36 | 16 | 11 | 4 | 2014-07-02 23:27:34+00:00 | 2023-08-15 16:38:34+00:00 | 0 | 3.583519 | 2.772589 | 2.397895 | 1.386294 |
| 19764 | Human | True | False | False | False | False | False | NaN | 16 | 0 | 3 | 0 | 2017-12-06 21:56:31+00:00 | 2023-07-26 18:32:25+00:00 | 0 | 2.772589 | -inf | 1.098612 | -inf |
| 19765 | Human | True | False | True | False | True | False | Software engineer at RealTracs. | 13 | 0 | 10 | 1 | 2015-11-14 14:44:05+00:00 | 2022-08-23 21:09:49+00:00 | 0 | 2.564949 | -inf | 2.302585 | 0.000000 |
| 19766 | Human | True | False | True | False | False | False | NaN | 7 | 0 | 2 | 0 | 2021-11-23 18:55:29+00:00 | 2023-10-06 22:50:45+00:00 | 0 | 1.945910 | -inf | 0.693147 | -inf |
| 19767 | Bot | True | False | False | False | True | False | NaN | 10 | 0 | 1 | 0 | 2016-04-22 22:11:59+00:00 | 2022-07-07 19:48:21+00:00 | 0 | 2.302585 | -inf | 0.000000 | -inf |